This is a first-cut attempt. It needs proper tests, which I am willing to add once we agree on the idea.
@jdolitsky this is what we discussed over Slack.
Problem
The DecompressStore has support for determining that a particular blob is tarred and then untarring it. This is great, but it assumes that the tar contains only one file. It even ignores the filename:
One could argue about whether we should or should not support tar with multiple files in it, but as long as tar supports it, people will use it, break this, and we will have issues.
The heart of the issue is that the standards assume one writer per descriptor. But if a descriptor references a tar file (or any other archiving format, but really only tar is used), then it might need multiple writers.
Solution
This PR is a proposed solution. It creates two kinds of "untar writer": one that has a single writer passed through, and another that has a "multi-writer" passed through. The multi-writer creates multiple writers, one for each filename, and the untar writer picks which one to use based on the filename.
It continues to have a regular `Writer`, so that non-tar layers can be handled.
The usage would be something like:
where `multiStore` implements `MultiWriterIngester`:
Calling `.Writers()` returns a map of filename to `content.Writer`, which is passed to `UntarWriter`, which uses the appropriate one.
Alternate
I did consider an alternate approach, one where, instead of returning `map[string]Writer`, it returned a func that can be called to get the writer:
But I honestly couldn't see how the complexity added value.
Looking forward to feedback.